I  Introduction


Many combinations of a method and an approach have been developed
for estimating the operating characteristics of graded item
responses (Samejima, 1972). They share two distinguishing
characteristics:

(1)  No  prior  mathematical  forms  are  assumed  for  the  resulting 
operating  characteristics, 

and: 

(2)  A  relatively  small  number  of  subjects,  say,  several  hundred,  are 
needed  for  the  basic  data  for  the  estimation. 

(cf.  Samejima,  1977c,  1977d,  1978a,  1978b,  1978c,  1978d,  1978e,  1978f.) 
We can categorize these methods and approaches as follows.

[A]  Approaches: 

(i)  Histogram  Ratio  Approach 

(ii)  Curve  Fitting  Approach 

(iii)  Conditional  P.D.F.  Approach 

(a)  Simple  Sum  Procedure 

(b)  Weighted  Sum  Procedure 

(c)  Proportioned  Sum  Procedure 

(iv)  Bivariate  P.D.F.  Approach 

[B]  Methods: 


(i)  Two-Parameter  Beta  Method 
(ii)  Pearson  System  Method 
(iii)  Normal  Approach  Method

It has been found that all of these combinations of an approach and
a method provide good estimates of the operating characteristics,
although each combination has its own merits as well as its relative
shortcomings when compared with the other combinations.

These combinations of a method and an approach also share the
following additional characteristics:

(3)  We  need  a  set  of  items  whose  operating  characteristics  are 
known,  in  order  to  estimate  the  operating  characteristics  of 
"unknown"  items; 

and 

(4)  Such a set of "known" items, which is called the Old Test, must
provide us with a substantially large and constant amount of test
information for the interval of the latent trait of our interest.

A typical situation which possesses these characteristics
is the tailored testing situation, where we have an item pool from
which an optimal subset of test items is selected and presented to a
specific examinee. When we wish to add new items to the item pool,
all we need is to use a fixed amount of test information as the
criterion for terminating the presentation of items to every
individual subject (cf. Samejima, 1977a, 1977b). Thus the Old Test in
this situation is not a single set of test items, but a combination of
as many subtests as the number of examinees who provided us with the
basic data for the estimation of the operating characteristics. We notice
that, though these features, (3) and (4), are suitable in the
tailored testing situation, they will restrict the applicability of
the estimation methods in the paper-and-pencil testing situation,
where we are forced to use a fixed set of test items.

In some situations, efforts have been made to eliminate
feature (3) using equivalent items and the Constant Information Model,
a new family of models, and so forth, so that we shall be able to use
the methods without depending upon the Old Test (cf. Samejima, 1979a,
1979b, 1979c). We note, however, that, even if we may have to depend
upon the Old Test in estimating the operating characteristics of "new
items," the applicability of the methods will be enhanced enormously
under any circumstances, if we can eliminate the requirement of
the constant test information, which is stated in (4), i.e., if we can
use a set of "known" items whose test information function is not
constant for the interval of ability of our interest, as the Old Test.
Fortunately, this expansion of the methods is relatively easy and
straightforward, at least in theory.

In the present paper, the rationale behind this generalization
of the methods will be presented and discussed. In so doing, the
transformation-free character of the maximum likelihood estimator
(Samejima, 1969) plays an essential role. The method of moments
for fitting a polynomial, which proved to be also the least squares
solution (Samejima and Livingston, 1979), plays another important role.

The procedures presented in this paper will be applied in
simulation studies in the near future, and the results will be published
as separate papers, in order to investigate how the theory works in practice.


II  Transformation  of  Latent  Trait 

Let $\theta$ be the latent trait, or ability, which assumes any
real number, such that

(2.1)  $-\infty < \theta < \infty$ .

Let $g$ $(=1,2,\ldots,n)$ be an item, and $x_g$ $(=0,1,\ldots,m_g)$ be a
graded item response (Samejima, 1969, 1972), which is reduced to the
binary item response when $m_g = 1$. The operating characteristic of the
graded item response is denoted by $P_{x_g}(\theta)$, which is the
conditional probability with which the examinee obtains the item score,
or provides us with the graded item response, $x_g$, given ability
$\theta$. Two typical examples of this operating characteristic are those
in the normal ogive model and in the logistic model, defined on the
graded response level (Samejima, 1972). The item response information
function, $I_{x_g}(\theta)$, is defined as the negative of the second
partial derivative of the natural logarithm of the operating
characteristic, such that

(2.2)  $I_{x_g}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{x_g}(\theta)$ ,

and the item information function is the regression of the item
response information function on ability $\theta$, which can be written as

(2.3)  $I_g(\theta) = \sum_{x_g=0}^{m_g} I_{x_g}(\theta) P_{x_g}(\theta)$ .


This item information function can be considered as an index of local
accuracy of estimation of $\theta$ provided by the item $g$, if the item
response information function assumes a positive value for every
item response (Samejima, 1973b), as is the case with the normal
ogive and the logistic models on the graded response level (cf.
Samejima, 1969, 1972, 1973a).

Let $V$ be the response pattern of the graded item responses,
such that

(2.4)  $V = (x_1, x_2, \ldots, x_n)'$ .

The operating characteristic of the response pattern $V$, which is
the conditional probability with which the examinee obtains the
response pattern $V$, given $\theta$, and is denoted by $P_V(\theta)$, can be
written, by virtue of the assumption of local independence (Lord and
Novick, 1968), by the formula

(2.5)  $P_V(\theta) = \prod_{x_g \in V} P_{x_g}(\theta)$ ,

and the response pattern information function, $I_V(\theta)$, is the negative
of the second partial derivative of the natural logarithm of the
operating characteristic of the response pattern, such that

(2.6)  $I_V(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_V(\theta)$ .

The test information function, $I(\theta)$, is defined as the regression
of the response pattern information function on ability $\theta$, such
that

(2.7)  $I(\theta) = \sum_{V} I_V(\theta) P_V(\theta)$ .

It has been shown both on the dichotomous and the graded response
levels that this test information function can be written as the
sum total of the item information functions, such that

(2.8)  $I(\theta) = \sum_{g=1}^{n} I_g(\theta)$

(Birnbaum, 1968; Samejima, 1969). We can prove from (2.3) that the
item information function is non-negative in nature, regardless of
the values of the item response information functions. By virtue
of (2.8), therefore, the test information function, $I(\theta)$, is also
non-negative in nature, and is used as an index of local accuracy
of estimation of ability $\theta$ provided by the test. Note, however, that
this index is meaningless unless the item response information function
assumes a non-negative value for every item response $x_g$, since,
otherwise, the existence of the unique maximum likelihood estimate
is not assured for every possible response pattern, as is the case
in the three-parameter normal ogive and logistic models (cf. Samejima,
1969, 1972, 1973b).

Let $\tau$ be a function of $\theta$, such that

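It can be seen that, with the response pattern $V$, we obtain
similar results, such that

(2.14)  $P^*_V(\tau) = P^*_V[\tau(\theta)] = P_V(\theta)$

for the operating characteristic, $P^*_V(\tau)$, and

(2.15)  $I^*_V(\tau) = I_V(\theta) \left[\frac{d\theta}{d\tau}\right]^2 - \frac{d^2\theta}{d\tau^2} \frac{\partial}{\partial\theta} \log P_V(\theta)$

for the information function, $I^*_V(\tau)$. We can write for the test
information function $I^*(\tau)$ either from (2.15) or from (2.12) such
that

(2.16)  $I^*(\tau) = I(\theta) \left[\frac{d\theta}{d\tau}\right]^2$

and, since $\tau$ is a strictly increasing function of $\theta$, we have

(2.17)  $[I^*(\tau)]^{1/2} = [I(\theta)]^{1/2} \frac{d\theta}{d\tau}$ .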
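
The step from (2.15) to (2.16) can be spelled out as follows. This is a sketch of the standard chain-rule argument under the assumption that $\theta$ is a twice-differentiable function of $\tau$; it is not quoted from the original derivation. The second term of (2.15) vanishes in the regression on $\theta$ because the operating characteristics of all response patterns sum to unity.

```latex
\begin{align*}
\frac{\partial}{\partial\tau}\log P^*_V(\tau)
  &= \frac{d\theta}{d\tau}\,\frac{\partial}{\partial\theta}\log P_V(\theta), \\
I^*_V(\tau) = -\frac{\partial^2}{\partial\tau^2}\log P^*_V(\tau)
  &= I_V(\theta)\left[\frac{d\theta}{d\tau}\right]^2
     - \frac{d^2\theta}{d\tau^2}\,\frac{\partial}{\partial\theta}\log P_V(\theta), \\
\sum_V P_V(\theta)\,\frac{\partial}{\partial\theta}\log P_V(\theta)
  &= \frac{\partial}{\partial\theta}\sum_V P_V(\theta) = 0, \\
I^*(\tau) = \sum_V I^*_V(\tau)\,P^*_V(\tau)
  &= I(\theta)\left[\frac{d\theta}{d\tau}\right]^2 .
\end{align*}
```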

The maximum likelihood estimate, $\hat\theta$, of ability $\theta$, which
is based upon the response pattern $V$, can be obtained by using the
operating characteristic $P_V(\theta)$ as the likelihood function. In a
similar manner, the corresponding maximum likelihood estimate, $\hat\tau$,
can be obtained by using $P^*_V(\tau)$ as the likelihood function. By virtue
of the transformation-free character of the maximum likelihood estimator,
however, this second maximum likelihood estimate can also be obtained by
the direct transformation of $\hat\theta$, such that

(2.18)  $\hat\tau = \tau(\hat\theta)$

(cf. Samejima, 1969).

Note that (2.18) has a great deal of practical importance,
especially when the transformation, $\tau(\theta)$, is given by a
relatively simple formula. Since in most cases there exists no
sufficient statistic for the response pattern $V$, the maximum
likelihood estimate, $\hat\tau$, must be obtained through a numerical
process, using the basic function $A^*_{x_g}(\tau)$, which is defined by

(2.19)  $A^*_{x_g}(\tau) = \frac{\partial}{\partial\tau} \log P^*_{x_g}(\tau)$

(cf. Samejima, 1969, 1972). Substituting (2.10) into (2.19),
we can write

(2.20)  $A^*_{x_g}(\tau) = \frac{d\theta}{d\tau} A_{x_g}(\theta)$ ,

where $A_{x_g}(\theta)$ is the basic function of the item response $x_g$
defined with respect to $\theta$. Since the derivative, $d\theta/d\tau$, is usually
of a complicated form, it is not easy to program the process so that
we shall be able to obtain the maximum likelihood estimate $\hat\tau$ as
the solution to the equation

(2.21)  $\sum_{x_g \in V} A^*_{x_g}(\tau) = 0$ .
It  is  much  easier,  therefore,  to  obtain  the  maximum  likelihood  6 
III  Latent Trait Providing a Constant Test Information for a
Specific Test

Here we assume that the test information function, $I(\theta)$,
of a specific test of our interest is not constant for the interval
$[\underline{\theta}, \bar{\theta}]$. We attempt to transform the latent trait $\theta$ to $\tau$, in
such a way that the resultant test information function, $I^*(\tau)$,
be constant for the interval $[\underline{\tau}, \bar{\tau}]$, where

(3.1)  $\underline{\tau} = \tau(\underline{\theta})$ ,  $\bar{\tau} = \tau(\bar{\theta})$ .


Let $C^2$ denote this desired, constant amount of test information.
From (2.17) we can write

(3.2)  $\frac{d\tau}{d\theta} = C^{-1} [I(\theta)]^{1/2}$ .

Now we obtain from (3.2) for the transformation of $\theta$ to $\tau$

(3.3)  $\tau = C^{-1} \int [I(\theta)]^{1/2} \, d\theta + d$ ,

where $d$ is an arbitrary constant.
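
The effect of (3.3) can be checked numerically. The sketch below is illustrative only: the function `sqrt_info` is a hypothetical square root of a test information function chosen so that its integral is known in closed form, and the trapezoidal rule and the central difference are assumptions of this example, not procedures taken from the paper. It integrates $[I(\theta)]^{1/2}$ to obtain $\tau$, then evaluates (2.16) to confirm that the transformed information is close to $C^2$.

```python
import math

def sqrt_info(theta):
    # Hypothetical [I(theta)]^(1/2); its integral from 0 is 4 * atan(theta).
    return 4.0 / (1.0 + theta * theta)

def tau_of_theta(theta, C, d=0.0, steps=2000):
    # (3.3): tau = C^{-1} * integral_0^theta [I(u)]^(1/2) du + d,
    # approximated by the trapezoidal rule (signed step for theta < 0).
    h = theta / steps
    total = 0.5 * (sqrt_info(0.0) + sqrt_info(theta))
    for i in range(1, steps):
        total += sqrt_info(i * h)
    return total * h / C + d

def transformed_info(theta, C, eps=1e-4):
    # (2.16): I*(tau) = I(theta) * (d theta / d tau)^2, with d tau / d theta
    # estimated by a central difference of the numerical transformation.
    dtau = (tau_of_theta(theta + eps, C) - tau_of_theta(theta - eps, C)) / (2.0 * eps)
    return sqrt_info(theta) ** 2 / dtau ** 2

if __name__ == "__main__":
    for theta in (-2.0, 0.0, 1.5):
        print(theta, transformed_info(theta, C=2.0))
```

With `C=2.0` the printed values should all be close to $C^2 = 4$, whatever the value of $\theta$, which is the constancy that the transformation is designed to achieve.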

Thus it has been shown that, as far as the square root of the test
information function is integrable, we can always transform the latent
trait $\theta$ to another scale, $\tau$, by means of (3.3), in such a way
that the resultant test information, $I^*(\tau)$, be constant. A problem
arises, however, when $[I(\theta)]^{1/2}$ is not integrable, or its integral
provides us with a highly complicated form, as is usually the case.
Perhaps the best practical solution for this problem is the use of
the method of moments.

It has been shown by Samejima and Livingston (Samejima and
Livingston, 1979) that the polynomial provided by the method of
moments to approximate any given function is also its least squares
solution, which is an appropriate characteristic for the present
purpose. It has also been demonstrated that, in fitting such a
polynomial, it is important to find an optimal interval of the
independent variable for the computation of the moments in order to
obtain a well-fitted function. If we succeed in obtaining such a
polynomial, we can write

(3.4)  $[I(\theta)]^{1/2} \doteq \sum_{k=0}^{m} \alpha_k \theta^k$ ,

where $\alpha_k$ is the coefficient of the term of degree $k$ and $m$ is the
degree of the polynomial. Substituting (3.4) into (3.3), we obtain

(3.5)  $\tau \doteq C^{-1} \sum_{k=0}^{m} \alpha_k (k+1)^{-1} \theta^{k+1} + d = \sum_{k=0}^{m+1} \alpha^*_k \theta^k$ ,

where

(3.6)  $\alpha^*_k = d$  for  $k = 0$ ,
       $\alpha^*_k = C^{-1} k^{-1} \alpha_{k-1}$  for  $k = 1, 2, \ldots, m+1$ .

The transformation of $\theta$ to $\tau$ can be made, therefore, through a
polynomial of degree $(m+1)$, which is quite simple.
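
As an illustration of (3.5) and (3.6), the following sketch converts a set of coefficients $\alpha_k$ into the coefficients $\alpha^*_k$ of the transformation polynomial. The numerical values are the degree 7 coefficients listed in Table 3-3 for Subtest 1 with the interval [-4.0, 4.0], and $C = 4.5$ is the value used later in the paper; the choice $d = 0$ and the function names are assumptions made for this example only.

```python
def transform_coefficients(alpha, C, d=0.0):
    # (3.6): alpha*_0 = d, and alpha*_k = alpha_{k-1} / (C * k) for k = 1, ..., m+1.
    return [d] + [a / (C * k) for k, a in enumerate(alpha, start=1)]

def evaluate(coeffs, theta):
    # Evaluate sum_k coeffs[k] * theta**k by Horner's rule.
    result = 0.0
    for c in reversed(coeffs):
        result = result * theta + c
    return result

# Degree 7 coefficients alpha_0 .. alpha_7 (Table 3-3, Subtest 1, interval [-4.0, 4.0]).
alpha = [4.72922, 0.10599, -0.03771, -0.04152, -0.01160, 0.00447, 0.00005, -0.00014]

# d = 0.0 is an assumption of this sketch; it keeps the origin unchanged.
alpha_star = transform_coefficients(alpha, C=4.5, d=0.0)

if __name__ == "__main__":
    for theta in (-2.0, 0.0, 2.0):
        print(theta, evaluate(alpha_star, theta))
```

The list `alpha_star` has one more element than `alpha`, in agreement with the degree $(m+1)$ of the transformation, and its derivative with respect to $\theta$ equals the fitted polynomial divided by $C$, as required by (3.2).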

For the purpose of illustration, we hypothesize two tests,
whose test information functions are not constant. Each of these
two tests consists of twenty-five graded test items with $m_g = 2$.
Since they are both subsets of the thirty-five test items of the Old
Test used in the previous studies, we shall call them Subtests 1 and
2, respectively. All these test items follow the normal ogive model,
whose operating characteristics are given by

(3.7)  $P_{x_g}(\theta) = [2\pi]^{-1/2} \int_{a_g(\theta - b_{x_g+1})}^{a_g(\theta - b_{x_g})} \exp[-u^2/2] \, du$ ,

where $a_g$ $(>0)$ is the item discrimination parameter and $b_{x_g}$ is
the item response difficulty parameter, which satisfies

(3.8)  $b_0 < b_1 < \cdots < b_{m_g} < b_{m_g+1}$ .

These item parameters are shown in Tables 3-1 and 3-2.

The item information function, $I_g(\theta)$, for each item of
Subtests 1 and 2 was obtained through (3.7), (2.2) and (2.3), and
the two test information functions, $I(\theta)$, were obtained through
(2.8). Figures 3-1 and 3-2 present the square roots of the test
information functions thus obtained by solid curves, for Subtests
1 and 2, respectively.

Taking $\underline{\theta} = -3.0$ and $\bar{\theta} = 3.0$ , the moments about the
TABLE  3-1

Item Discrimination Parameters of the Twenty-Five
Items of Each of Subtests 1 and 2

Item g    a_g            Subtest 1    Subtest 2
   1      1.8                             X
   2      1.9                             X
   3      2.0                             X
   4      1.5                             X
   5      1.6                             X
   6      1.4                X            X
   7      1.9                X            X
   8      1.8                X            X
   9      1.6                X            X
  10      2.0                X            X
  11      [illegible]        X            X
  12      [illegible]        X            X
  13      [illegible]        X
  14      [illegible]        X
  15      2.0                X
  16      1.6                X
  17      1.8                X
  18      [illegible]        X
  19      [illegible]        X
  20      [illegible]        X
  21      [illegible]        X
  22      [illegible]        X
  23      [illegible]        X            X
  24      [illegible]        X            X
  25      2.0                X            X
  26      1.6                X            X
  27      1.7                X            X
  28      1.4                X            X
  29      1.9                X            X
  30      1.6                X            X
  31      1.5                             X
  32      1.7                             X
  33      1.8                             X
  34      [illegible]                     X
  35      1.4                             X
TABLE  3-2

Two Item Difficulty Parameters of Each Item of
Subtests 1 and 2

Item g     b_1       b_2      Subtest 1    Subtest 2
   1      -4.75     -3.75                      X
   2      -4.50     -3.50                      X
   3      -4.25     -3.25                      X
   4      -4.00     -3.00                      X
   5      -3.75     -2.75                      X
   6      -3.50     -2.50         X            X
   7      -3.00     -2.00         X            X
   8      -3.00     -2.00         X            X
   9      -2.75     -1.75         X            X
  10      -2.50     -1.50         X            X
  11      -2.25     -1.25         X            X
  12      -2.00     -1.00         X            X
  13      -1.75     -0.75         X
  14      -1.50     -0.50         X
  15      -1.25     -0.25         X
  16      -1.00      0.00         X
  17      -0.75      0.25         X
  18      -0.50      0.50         X
  19      -0.25      0.75         X
  20       0.00      1.00         X
  21       0.25      1.25         X
  22       0.50      1.50         X
  23       0.75      1.75         X            X
  24       1.00      2.00         X            X
  25       1.25      2.25         X            X
  26       1.50      2.50         X            X
  27       1.75      2.75         X            X
  28       2.00      3.00         X            X
  29       2.25      3.25         X            X
  30       2.50      3.50         X            X
  31       2.75      3.75                      X
  32       3.00      4.00                      X
  33       3.25      4.25                      X
  34       3.50      4.50                      X
  35       3.75      4.75                      X

FIGURE 3-1:  Square Root of the Test Information Function, $[I(\theta)]^{1/2}$ , (Solid Line) and
the Polynomials of Degrees 3 through 7 (Dotted Lines), Which Were Fitted by the Method
of Moments with [-3.0, 3.0] As the Interval of $\theta$ .  Subtest 1.

FIGURE 3-2:  Square Root of the Test Information Function, $[I(\theta)]^{1/2}$ , (Solid Line) and
the Polynomials of Degrees 3 through 7 (Dotted Lines), Which Were Fitted by the Method
of Moments with [-3.0, 3.0] As the Interval of $\theta$ .  Subtest 2.

origin, $\mu'_r$ , which are given by

(3.9)  $\mu'_r = \int_{\underline{\theta}}^{\bar{\theta}} \theta^r [I(\theta)]^{1/2} \, d\theta$ ,  $r = 0, 1, 2, 3, \ldots, m$ ,

were computed for each of the two subtests, where $m = 7$. Note
that the 0-th moment is the area under the curve of $[I(\theta)]^{1/2}$
for the interval of $\theta$, [-3.0, 3.0], which is adjusted to unity.
Since the midpoint of the interval, [-3.0, 3.0], is the origin,
these moments are also the moments about the midpoint, which we
need in applying the method of moments. These moments turned out
to be: 1.00000, 0.00768, 2.73116, -0.00547, 13.83270, -0.10637,
84.67312 and -0.92245 for Subtest 1, and: 1.00000, 0.04742, 3.54786,
0.10420, 19.44401, 0.38678, 123.79663 and 1.83934 for Subtest 2.
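
One way to read the method of moments here is as moment matching: find the polynomial of degree $m$ whose first $m+1$ moments over the interval equal the given $\mu'_r$. The sketch below implements that reading with a small linear solver; it is an assumption of this example that this is the computation intended by Samejima and Livingston (1979), and the function names are hypothetical. Because the moments are normalized so that the 0-th moment is unity, the fitted polynomial approximates the normalized curve; the ratios of its coefficients appear to agree with the ratios of the degree 3 coefficients listed for Subtest 1 in Table 3-3.

```python
def solve(matrix, rhs):
    # Solve a small dense linear system by Gaussian elimination with partial pivoting.
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def power_integral(p, lo, hi):
    # Integral of theta**p over [lo, hi] for a non-negative integer p.
    return (hi ** (p + 1) - lo ** (p + 1)) / (p + 1)

def fit_by_moments(moments, lo, hi):
    # Find a_0..a_m such that, for r = 0..m,
    #   integral over [lo, hi] of theta**r * sum_k a_k theta**k  equals  moments[r].
    size = len(moments)
    matrix = [[power_integral(r + k, lo, hi) for k in range(size)] for r in range(size)]
    return solve(matrix, moments)

# First four normalized moments reported for Subtest 1 on [-3.0, 3.0].
mu = [1.00000, 0.00768, 2.73116, -0.00547]
coeffs = fit_by_moments(mu, -3.0, 3.0)

if __name__ == "__main__":
    print([c / coeffs[0] for c in coeffs])
```

The coefficients returned are on the normalized scale; multiplying them by the area under $[I(\theta)]^{1/2}$ over the interval would restore the original scale.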

The polynomials of degrees 3, 4, 5, 6 and 7 were obtained using the
method of moments, and these five sets of coefficients are presented
in Table 3-3 for Subtest 1, and in Table 3-4 for Subtest 2 (cf.
Samejima and Livingston, 1979). These five polynomials are shown
by dotted curves in Figures 3-1 and 3-2 for Subtests 1 and 2,
respectively.

We can see in these ten graphs of Figures 3-1 and 3-2 that,
although the polynomials fit fairly well to the square roots of the
test information functions, there still is much to be desired,
especially for extreme values of $\theta$. For this reason, the
same process was repeated for both Subtests 1 and 2, using a
different interval for the method of moments, i.e., $\underline{\theta} = -4.0$ and
TABLE  3-3

Coefficients of the Polynomials of Degrees 3 through 7
Approximating $[I(\theta)]^{1/2}$ , Which Were Obtained by the
Method of Moments Using [-3.0, 3.0] and [-4.0, 4.0]
As the Interval of $\theta$ , Respectively.

Subtest 1

                       Interval
Degree    k     [-3.0, 3.0]    [-4.0, 4.0]
  3       0       4.90665        4.96268
          1       0.07842        0.00602
          2      -0.16475       -0.18690
          3      -0.01243        0.00021

  4       0       4.67066        4.73399
          1       0.07842        0.00602
          2       0.09745       -0.04398
          3      -0.01243        0.00021
          4      -0.03399       -0.01042

  5       0       4.67066        4.73399
          1       0.17323        0.05956
          2       0.09745       -0.04398
          3      -0.06159       -0.01541
          4      -0.03399       -0.01042
          5       0.00492        0.00088

  6       0       4.78242        4.72922
          1       0.17323        0.05956
          2      -0.16329       -0.03771
          3      -0.06159       -0.01541
          4       0.05290       -0.01160
          5       0.00492        0.00088
          6      -0.00708        0.00005

  7       0       4.78242        4.72922
          1       0.26677        0.10599
          2      -0.16329       -0.03771
          3      -0.15513       -0.04152
          4       0.05290       -0.01160
          5       0.02778        0.00447
          6      -0.00708        0.00005
          7      -0.00157       -0.00014

TABLE  3-4

Coefficients of the Polynomials of Degrees 3 through 7
Approximating $[I(\theta)]^{1/2}$ , Which Were Obtained by the
Method of Moments Using [-3.0, 3.0] and [-4.0, 4.0]
As the Interval of $\theta$ , Respectively.

Subtest 2

                       Interval
Degree    k     [-3.0, 3.0]    [-4.0, 4.0]
  3       0       2.63641        3.02995
          1       0.22214        0.10837
          2       0.25995        0.10841
          3      -0.03114       -0.00924

  4       0       2.02466        2.27454
          1       0.22214        0.10837
          2       0.93968        0.58054
          3      -0.03114       -0.00924
          4      -0.08811       -0.03443

  5       0       2.02466        2.27454
          1       0.41951        0.24669
          2       0.93968        0.58054
          3      -0.13348       -0.04958
          4      -0.08811       -0.03443
          5       0.01023        0.00227

  6       0       2.02136        2.14813
          1       0.41951        0.24669
          2       0.94740        0.74646
          3      -0.13348       -0.04958
          4      -0.09071       -0.06554
          5       0.01023        0.00227
          6       0.00021        0.00143

  7       0       2.02136        2.14813
          1       0.60587        0.37926
          2       0.94740        0.74646
          3      -0.31984       -0.12415
          4      -0.09071       -0.06554
          5       0.05579        0.01252
          6       0.00021        0.00143
          7      -0.00313       -0.00040

$\bar{\theta} = 4.0$ . The new set of eight moments about the origin, which
were computed through (3.9), proved to be: 1.00000, 0.01082,
4.26091, 0.10885, 35.49275, 1.61607, 367.31471 and 24.05220 for
Subtest 1, and: 1.00000, 0.02913, 6.01702, 0.03999, 56.94637,
-0.09788, 633.40916 and -3.04930 for Subtest 2. The coefficients
of the resultant five polynomials are also presented in Table 3-3
for Subtest 1, and in Table 3-4 for Subtest 2. Figures 3-3 and
3-4 present the new polynomials of degrees 3, 4, 5, 6 and 7 by
dotted curves, together with the square root of the test information
function, which is shown by a solid curve, for Subtests 1 and 2,
respectively. We can see a substantial improvement in the fit of the
polynomials for both subtests, and, especially for Subtest 1, the
polynomial whose degree is as low as 4 already provides us with an
excellent fit.


FIGURE 3-3:  Square Root of the Test Information Function, $[I(\theta)]^{1/2}$ , (Solid Line) and
the Polynomials of Degrees 3 through 7 (Dotted Lines), Which Were Fitted by the Method
of Moments with [-4.0, 4.0] As the Interval of $\theta$ .  Subtest 1.

FIGURE 3-4:  Square Root of the Test Information Function, $[I(\theta)]^{1/2}$ , (Solid Line) and
the Polynomials of Degrees 3 through 7 (Dotted Lines), Which Were Fitted by the Method
of Moments with [-4.0, 4.0] As the Interval of $\theta$ .  Subtest 2.
IV  Basic  Data  for  Estimating  the  Operating  Characteristics

We must administer both the Old Test, whose test information
function need not be constant, and the set of new items, whose
operating characteristics are to be estimated, to, say, several
hundred examinees, whom we sampled from an appropriate population,
as was the case in the previous studies, in which we used an Old
Test whose test information function is constant. Let $N$ denote
the number of examinees. It is required that the "known" test items
of the Old Test follow a model, or models, which provides us with a
unique maximum likelihood estimate for every possible response
pattern (cf. Samejima, 1969, 1972).

Next, we must obtain the maximum likelihood estimate, $\hat\theta$,
of ability $\theta$ for every individual examinee from his response
pattern $V$ on the Old Test of $n$ items. When there exists a
simple sufficient statistic for the response pattern, as in the
logistic model on the dichotomous response level, this process is
relatively simple and straightforward. That is to say, in the
logistic model whose item characteristic function, $P_g(\theta)$, or
the operating characteristic for $x_g = 1$ on the dichotomous response
level, is given by

(4.1)  $P_g(\theta) = [1 + \exp\{-1.7 a_g (\theta - b_g)\}]^{-1}$ ,

where $a_g$ and $b_g$ are the discrimination and difficulty parameters,
respectively, the maximum likelihood estimate is the solution of $\theta$
to the equation

(4.2)  $t(V) = \sum_{g=1}^{n} a_g P_g(\theta)$ ,

where $t(V)$ is a simple sufficient statistic for the response pattern
$V$ which is given by

(4.3)  $t(V) = \sum_{x_g \in V} a_g x_g$

(cf. Birnbaum, 1968). When there exists no sufficient statistic for
the response pattern, as is the case in most situations, the maximum
likelihood estimate must be obtained through a more complicated
numerical process, using $[\sum_{g=1}^{n} m_g + n]$ basic functions
(Samejima, 1969, 1972), $A_{x_g}(\theta)$, each of which is defined by

(4.4)  $A_{x_g}(\theta) = \frac{\partial}{\partial\theta} \log P_{x_g}(\theta)$

for each graded item response $x_g$. Thus the maximum likelihood
estimate is the solution to the equation

(4.5)  $\sum_{x_g \in V} A_{x_g}(\theta) = 0$ ,

which can be obtained with the aid of an electronic computer using the
Newton-Raphson method.
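
For the dichotomous logistic case, the solution of (4.2) by the Newton-Raphson method can be sketched as below. The item parameters, the starting value and the stopping rule are hypothetical choices made for this illustration; the paper does not specify them. The sketch also assumes a mixed response pattern, since a pattern of all 0's or all 1's has no finite maximum likelihood estimate.

```python
import math

D = 1.7  # scaling constant appearing in (4.1)

def icf(theta, a, b):
    # (4.1): item characteristic function of the logistic model.
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def mle_logistic(responses, a, b, theta=0.0, tol=1e-10, max_iter=100):
    # Solve t(V) = sum_g a_g P_g(theta), i.e. (4.2), by Newton-Raphson.
    if all(x == 0 for x in responses) or all(x == 1 for x in responses):
        raise ValueError("no finite estimate for an all-0 or all-1 pattern")
    t_v = sum(ag * x for ag, x in zip(a, responses))            # (4.3)
    for _ in range(max_iter):
        p = [icf(theta, ag, bg) for ag, bg in zip(a, b)]
        residual = t_v - sum(ag * pg for ag, pg in zip(a, p))
        # Derivative of sum_g a_g P_g(theta) with respect to theta.
        slope = sum(D * ag * ag * pg * (1.0 - pg) for ag, pg in zip(a, p))
        step = residual / slope
        theta += step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

if __name__ == "__main__":
    # Hypothetical three-item test and response pattern.
    a = [1.0, 1.5, 2.0]
    b = [-1.0, 0.0, 1.0]
    print(mle_logistic([1, 1, 0], a, b))
```

The graded case of (4.4) and (4.5) follows the same pattern, with the sum of the basic functions as the quantity driven to zero and its derivative supplying the slope.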

The third step is to compute the test information function,
$I(\theta)$, of the Old Test through (2.2), (2.3) and (2.8), and, once it has
been done, its square root, $[I(\theta)]^{1/2}$, must be computed.

Then we calculate the moments of $[I(\theta)]^{1/2}$ about the
midpoint of the interval, $[\underline{\theta}, \bar{\theta}]$, and apply the method of moments
to obtain the polynomial which approximates $[I(\theta)]^{1/2}$. In so
doing, it is important to adjust the endpoints of the interval, $\underline{\theta}$
and $\bar{\theta}$, and the degree of the polynomial $m$, as was illustrated in
the preceding chapter, in order to obtain a good approximation.
Thus the $(m+1)$ coefficients, $\alpha_k$ $(k=0,1,2,\ldots,m)$, in (3.4) have
been obtained for the Old Test.

After this has been done, set the desired amount of constant
test information, $C^2$, for the second test information function,
$I^*(\tau)$, which is to be used after the transformation of $\theta$ to $\tau$.
Since the normal approximation to the conditional distribution of
$\hat\tau$, given $\tau$, plays an essential role in the estimation methods,
this constant amount of test information must be substantially large.

Next, we must obtain the coefficients $\alpha^*_k$ $(k=0,1,2,\ldots,m,m+1)$
in the transformation of $\theta$ to $\tau$, which is given by (3.5). First,
determine the value of $\tau$ corresponding to the origin of $\theta$, and
use this as $d$ in (3.5). If we wish to keep the position of the
origin unchanged, then set $d = 0$. Using these two values of $C$
$(>0)$ and $d$ thus obtained, and the coefficients $\alpha_k$'s of the
polynomial approximating $[I(\theta)]^{1/2}$, obtain the coefficients, $\alpha^*_k$,
of the polynomial given by (3.5) from (3.6).

The final step is to obtain the maximum likelihood estimate
$\hat\tau$, of the transformed latent trait $\tau$, on the Old Test, for each
of the $N$ examinees. We may do this through the equation

(4.6)  $\hat\tau = \sum_{k=0}^{m+1} \alpha^*_k \hat\theta^k$ ,

where $\hat\theta$ is the maximum likelihood estimate of $\theta$ on the Old Test
for each individual examinee, which was obtained earlier. This set
of the maximum likelihood estimates $\hat\tau$ for the total group of $N$
examinees is the basic data for each estimation process of the
operating characteristics of the graded item responses, which is to
be presented in a later chapter.

For the purpose of illustration, Figures 4-1 and 4-2 present
the relative frequency distributions of $\hat\theta$ and $\hat\tau$ for the five
hundred hypothetical subjects, respectively, which were obtained
through Subtest 1. This subtest consists of twenty-five graded
test items which follow the normal ogive model, with the
discrimination and difficulty parameters shown in Tables 3-1 and
3-2, respectively, as was introduced in the preceding chapter.
The values of $\hat\theta$ were obtained by using the basic function defined
by (4.4) for each item score $x_g$, and as the solution to the
equation (4.5). The transformation of $\hat\theta$ to $\hat\tau$ was made through
(4.6) with $m = 7$, in which the coefficients, $\alpha^*_k$'s, were based
on the coefficients $\alpha_k$'s obtained by the method of moments with
$\underline{\theta} = -4.0$ and $\bar{\theta} = 4.0$ , and $C = 4.5$ . These coefficients,
$\alpha_k$'s, are shown in Table 3-3. As we can see in these two figures,
the frequency distribution of $\hat\tau$ turned out to be more rectangular

FIGURE 4-1:  Relative Frequency Distribution of $\hat\theta$ , Which Was Obtained for the Five Hundred
Hypothetical Examinees on Subtest 1, with 0.25 as the Subinterval Width, Together with the
Polynomials of Degrees 3 through 7 Obtained by the Method of Moments to Approximate the
Density Function of $\hat\theta$ .

FIGURE 4-2:  Relative Frequency Distribution of $\hat\tau$ , Which Was Obtained for the Five Hundred
Hypothetical Examinees on Subtest 1, with 0.25 as the Subinterval Width, Together with the
Polynomials of Degrees 3 through 7 Obtained by the Method of Moments to Approximate the
Density Function of $\hat\tau$ .

than that of $\hat\theta$, although they are similar in shape. To make the
difference between the two frequency distributions more visible,
five polynomials of degrees 3, 4, 5, 6 and 7 were obtained by the
method of moments to approximate each of the density functions of
$\hat\theta$ and $\hat\tau$, and were drawn by solid lines in the five graphs of
each of Figures 4-1 and 4-2, along with the corresponding frequency
distribution. We note that, except for the polynomial of degree 3
in each figure, the four approximated density functions are very
similar to one another, and they are closer to a rectangle for $\hat\tau$
than those for $\hat\theta$. Since the method of moments was applied to
a set of observations, instead of some empirical function, the 0-th
through seventh moments about the origin were computed directly
from the observations, and they turned out to be 1.00000, -0.00472,
2.19052, -0.04378, 9.17620, -0.52428, 48.47210 and -4.96487 for
$\hat\theta$, and 1.00000, 0.00479, 2.12231, -0.02483, 8.51515, -0.35195,
42.31180 and -2.77758 for $\hat\tau$. The interval of $\hat\theta$ used for the
method of moments is [-2.9843, 2.9904] and that of $\hat\tau$ is [-3.0479,
2.8681]. The coefficients of these ten polynomials are presented
in Table 4-1.
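
Computing moments directly from a set of observations, as described above, amounts to averaging powers of the estimates. A minimal sketch follows; the function name and the small data set in the usage lines are hypothetical and are not the five hundred estimates of the study.

```python
def raw_moments(values, max_order):
    # r-th moment about the origin: the mean of value**r over the observations,
    # for r = 0, 1, ..., max_order. The 0-th moment is always 1.
    n = len(values)
    return [sum(v ** r for v in values) / n for r in range(max_order + 1)]

if __name__ == "__main__":
    estimates = [-1.2, -0.4, 0.1, 0.3, 1.5]   # hypothetical estimates
    print(raw_moments(estimates, 7))
```

The eight values returned for `max_order=7` play the same role as the moments listed in the text, with the smallest and largest observations defining the interval used for the fit.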

Figures 4-3 and 4-4 present the corresponding frequency
distributions and the polynomials of degrees 3, 4, 5, 6 and 7
obtained through Subtest 2, respectively. This subtest also consists
of twenty-five graded test items following the normal ogive model,
but ten of the items are different from those which are used in
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TABLE  4-1 

Coefficients  of  the  Two  Sets  of  Polynomials  of  Degrees  3 
Through  7,  Which  Were  Obtained  by  the  Method  of  Moments 
to  Approximate  the  Density  Functions  of  6  and  f  . 
Respectively.  The  Maximum  Likelihood  Estimation  Is 
Based  on  Subtest  1. 


Coefficient 

for 

e 

0  l 

0.22252 

0.21204 

0.00090 

-0.00092 

2  R 

-0.01854 

-0.01463 

3  3 

-0.00023 

0.00016 

0  D 

0.19916 

0.18470 

1  G 

0.00074 

-0.00198 

2  R 

0.00765 

0.01688 

3  . 

-0.00019 

0.00044 

4  4 

-0.00342 

-0.00424 

0  D 

0.19918 

0.18487 

1  g 

-0.00609 

-0.01220 

2  R 

0.00761 

0.01661 

3  R 

0.00339 

0.00594 

*  5 

-0.00342 

-0.00419 

5  5 

-0.00036 

-0.00057 

0 

0.18920 

0.18183 

1  D 

-0.00623 

-0.01244 

2  G 

0.03108 

0.02397 

3  R 

0.00348 

0.00611 

4  . 

-0.01131 

-0.00674 

5  6 

-0.00037 

-0.00059 

6 

0.00065 

0.00022 

0 

0.18922 

0.18198 

1  D 

-0.01305 

-0.02135 

2  G 

0.03102 

0.02351 

3  R 

0.01036 

0.01535 

4  R 

-0.01128 

-0.00654 

5  7 

-0.00207 

-0.00294 

6  ' 

0.00065 

0.00020 

7 

0.00012 

0.00017 

Relative  Frequency  Distribution  of  6  ,  Which  Was  Obtained  for  the  Five  Hundred  Hypothetical 
Examinees  on  Subtest  2,  with  0.25  as  the  Subinterval  Width,  Together  with  the  Polynomial  of 
Degree  3  Obtained  by  the  Method  of  Moments  to  Approximate  the  Density  Function  of  6  . 


Relative  Frequency  Distribution  of  x  ,  Which  Was  Obtained  for  the  Five  Hundred  Hypothetical 
Examinees  on  Subtest  2,  with  0.25  as  the  Subinterval  Width,  Together  with  the  Polynomial  of 
Degree  3  Obtained  by  the  Method  of  Moments  to  Approximate  the  Density  Function  of  x  . 


FIGURE  4-4  (Continued):  Subtest  2,  x  ,  Polynomial  of  Degree 
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Subtest  1,  as  is  shown  in  Tables  3-1  and  3-2.  Just  as  in  the  case 
of  Subtest  1,  the  transformation  of  0  to  t  was  made  through  (4.6) 
with  m  =  7  ,  and  the  interval  used  for  obtaining  the  coefficients 
a^'s  in  the  method  of  moments  is  [-4.0,  4.0].  The  coefficients 
a*'s  thus  obtained  are  shown  in  Table  3-4.  The  amount  of  the 
constant  test  information  for  t  is  different,  however,  and  we 
used  C  =  3.5  instead  of  C  =  4.5  . 

It is noted that the two frequency distributions of $\hat{\theta}$ ,
which were obtained through Subtests 1 and 2, respectively, are
substantially different from each other, and so are those of
$\hat{\tau}$ . Although the latter is reasonable because of the
difference in the two transformations of $\theta$ to $\tau$ , the two
frequency distributions of $\hat{\theta}$ should not be so different, since
they are both estimates of the same $\theta$ for the same group of
five hundred examinees. If we focus our attention on the polynomials
approximating the density function of $\hat{\theta}$ , however, we notice that
the two sets of polynomials of degree 4 or greater are almost identical.
In each of Figures 4-3 and 4-4, the approximated polynomials
are very similar, except for the one with degree 3, as was the case
with those obtained through Subtest 1. These approximated density
functions are steeper for $\hat{\tau}$ than for $\hat{\theta}$ , and the difference is
greater than in the case of Subtest 1. The 0-th through seventh
moments about the origin for $\hat{\theta}$ are 1.00000, 0.00694, 2.31594, 0.07941,
9.95147, 0.41052, 52.81177 and 2.12395, and those for $\hat{\tau}$ are 1.00000,
0.06363, 1.48640, 0.41654, 5.19558, 2.54982, 24.35844 and 16.73911.
The interval of $\hat{\theta}$ used in the method of moments is [-2.9290,
2.9625], and that of $\hat{\tau}$ is [-2.9315, 2.9160]. The coefficients of
these polynomials are presented in Table 4-2.


TABLE 4-2

Coefficients of the Two Sets of Polynomials of Degrees 3
Through 7, Which Were Obtained by the Method of Moments
to Approximate the Density Functions of $\hat{\theta}$ and $\hat{\tau}$ ,
Respectively. The Maximum Likelihood Estimation Is
Based on Subtest 2.

Degree   Power   Coefficient for $\hat{\theta}$   Coefficient for $\hat{\tau}$

   3       0            0.22600                  0.27318
           1           -0.00098                 -0.00149
           2           -0.01935                 -0.03584
           3            0.00057                  0.00102

   4       0            0.19975                  0.29301
           1            0.00445                 -0.00185
           2            0.01073                 -0.05903
           3           -0.00089                  0.00112
           4           -0.00404                  0.00317

   5       0            0.19932                  0.29291
           1           -0.00026                 -0.01481
           2            0.01141                 -0.05887
           3            0.00164                  0.00819
           4           -0.00415                  0.00314
           5           -0.00026                 -0.00074

   6       0            0.19785                  0.29859
           1            0.00039                 -0.01503
           2            0.01496                 -0.07282
           3            0.00120                  0.00834
           4           -0.00538                  0.00803
           5           -0.00021                 -0.00076
           6            0.00010                 -0.00042

   7       0            0.19707                  0.29845
           1           -0.00813                 -0.03297
           2            0.01737                 -0.07238
           3            0.01000                  0.02724
           4           -0.00639                  0.00784
           5           -0.00244                 -0.00563
           6            0.00020                 -0.00040
           7            0.00016                  0.00035



V  Conditional Moments of the Maximum Likelihood Estimate $\hat{\tau}$ and
the Three Methods of Approximating the Conditional Density

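Let A be an estimator of $\tau$ , and $\eta$ be the error of
estimation. We assume that the conditional distribution of $\eta$ ,
given $\tau$ , is normal, with 0 and $\sigma$ as the two parameters, and
that A is given by the simple sum of $\tau$ and $\eta$ , such that

(5.1)  $A = \tau + \eta$ .

We obtain for the first two conditional moments of $\tau$ about the
origin, and for the third and fourth conditional moments about the
conditional mean, given A ,

(5.2)  $E(\tau \mid A) = A + \sigma^{2} \frac{d}{dA} \log g(A)$ ,

(5.3)  $E(\tau^{2} \mid A) = A^{2} + 2A\sigma^{2} \frac{d}{dA} \log g(A) + \sigma^{4} \left[ \frac{d^{2}}{dA^{2}} \log g(A) + \left\{ \frac{d}{dA} \log g(A) \right\}^{2} \right] + \sigma^{2}$ ,

(5.4)  $E[\{\tau - E(\tau \mid A)\}^{3} \mid A] = \sigma^{6} \left[ \frac{d^{3}}{dA^{3}} \log g(A) \right]$ ,

and

(5.5)  $E[\{\tau - E(\tau \mid A)\}^{4} \mid A] = \sigma^{4} \left[ 3 + 6\sigma^{2} \left\{ \frac{d^{2}}{dA^{2}} \log g(A) \right\} + 3\sigma^{4} \left\{ \frac{d^{2}}{dA^{2}} \log g(A) \right\}^{2} + \sigma^{4} \left\{ \frac{d^{4}}{dA^{4}} \log g(A) \right\} \right]$ ,

where g(A) is the marginal density function of A .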

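By virtue of the fact that $I^{*}(\tau) = C^{2}$ and that the asymptotic
conditional distribution of the maximum likelihood estimate $\hat{\tau}$ , given
$\tau$ , is the normal distribution with $\tau$ and $[I^{*}(\tau)]^{-1/2}$ as the
parameters (Samejima, 1975), we can write for the first two conditional
moments of $\tau$ about the origin, and for the third and fourth conditional
moments about the conditional mean, given $\hat{\tau}$ ,

(5.6)  $E(\tau \mid \hat{\tau}) = \hat{\tau} + C^{-2} \frac{d}{d\hat{\tau}} \log g(\hat{\tau})$ ,

(5.7)  $E(\tau^{2} \mid \hat{\tau}) = \hat{\tau}^{2} + 2\hat{\tau} C^{-2} \frac{d}{d\hat{\tau}} \log g(\hat{\tau}) + C^{-4} \left[ \frac{d^{2}}{d\hat{\tau}^{2}} \log g(\hat{\tau}) + \left\{ \frac{d}{d\hat{\tau}} \log g(\hat{\tau}) \right\}^{2} \right] + C^{-2}$ ,

(5.8)  $E[\{\tau - E(\tau \mid \hat{\tau})\}^{3} \mid \hat{\tau}] = C^{-6} \left[ \frac{d^{3}}{d\hat{\tau}^{3}} \log g(\hat{\tau}) \right]$ ,

(5.9)  $E[\{\tau - E(\tau \mid \hat{\tau})\}^{4} \mid \hat{\tau}] = C^{-4} \left[ 3 + 6C^{-2} \left\{ \frac{d^{2}}{d\hat{\tau}^{2}} \log g(\hat{\tau}) \right\} + 3C^{-4} \left\{ \frac{d^{2}}{d\hat{\tau}^{2}} \log g(\hat{\tau}) \right\}^{2} + C^{-4} \left\{ \frac{d^{4}}{d\hat{\tau}^{4}} \log g(\hat{\tau}) \right\} \right]$ ,

where $g(\hat{\tau})$ is the marginal density function of $\hat{\tau}$ .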
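
As an illustration of (5.6) through (5.9), the sketch below evaluates the four quantities at a single value of $\hat{\tau}$ when $g(\hat{\tau})$ is a polynomial given by its coefficients in ascending powers. It assumes that the first output is the conditional mean and that the other three are moments about that mean (the variance being (5.7) minus the square of (5.6)). The function name and the example polynomial are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def conditional_moments(g_coeffs, C, tau_hat):
    """Conditional mean and 2nd-4th central moments of tau given tau_hat,
    for a polynomial marginal density g with ascending coefficients."""
    g = Polynomial(g_coeffs)
    g0 = g(tau_hat)
    # Ratios g^(k)/g for k = 1..4
    r1, r2, r3, r4 = (g.deriv(k)(tau_hat) / g0 for k in range(1, 5))
    # Derivatives of log g, built up from the ratios
    l1 = r1
    l2 = r2 - l1**2
    l3 = r3 - 3 * l1 * l2 - l1**3
    l4 = r4 - 4 * l1 * l3 - 3 * l2**2 - 6 * l1**2 * l2 - l1**4
    s2 = 1.0 / C**2                         # error variance, C**-2
    mean = tau_hat + s2 * l1                # (5.6)
    variance = s2 + s2**2 * l2              # (5.7) minus the squared mean
    third = s2**3 * l3                      # (5.8)
    fourth = s2**2 * (3 + 6 * s2 * l2 + 3 * s2**2 * l2**2 + s2**2 * l4)  # (5.9)
    return mean, variance, third, fourth

# Hypothetical density shape g(x) proportional to (x + 5)**2, evaluated at 0.
moments_at_zero = conditional_moments([25.0, 10.0, 1.0], C=2.0, tau_hat=0.0)
```

For this toy polynomial the derivatives of log g are available in closed form, so the output can be checked by hand. With the coefficients of Table 4-1 or 4-2 the same call would be repeated for each observed $\hat{\tau}$ .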

The formulas (5.6) through (5.9) imply that, since the set of
N maximum likelihood estimates, $\hat{\tau}$ , is available as our basic data,
these conditional moments can be estimated solely from $g(\hat{\tau})$ , provided
that we can approximate this marginal density function by fitting an
appropriate four times differentiable function to the set of N $\hat{\tau}$'s .
This has been done in the previous studies using $\hat{\theta}$ instead of $\hat{\tau}$ ,
by adopting a polynomial of degree 3 or 4, which was obtained by
the method of moments.

After these conditional moments have been obtained, which are
functions of $\hat{\tau}$ , we can fit some appropriate function for the
conditional density function of $\tau$ , given $\hat{\tau}$ . In the Normal
Approach Method, only the first two conditional moments are used,
and the normal density function is fitted for the conditional distribution
with $E(\tau \mid \hat{\tau})$ and $[E(\tau^{2} \mid \hat{\tau}) - \{E(\tau \mid \hat{\tau})\}^{2}]$ as the parameters. For
simplicity, let $\mu'_{1}$ be the first conditional moment of $\tau$ about
the origin, and $\mu_{2}$ be the second conditional moment of $\tau$ about
the mean, given $\hat{\tau}$ , respectively. Thus the approximated conditional
density function, $\phi(\tau \mid \hat{\tau})$ , in the Normal Approach Method is given by

(5.10)  $\phi(\tau \mid \hat{\tau}) = (2\pi\mu_{2})^{-1/2} \exp[-(\tau - \mu'_{1})^{2} / (2\mu_{2})]$ .

In the Pearson-System Method, all of the above four conditional
moments are used. For simplicity, let $\mu_{3}$ and $\mu_{4}$ denote the third
and fourth conditional moments of $\tau$ about the mean, given $\hat{\tau}$ ,
in addition to the symbols $\mu'_{1}$ and $\mu_{2}$ . Pearson's criterion $\kappa$ (Elderton
and Johnson, 1969; Johnson and Kotz, 1970) is defined by

(5.11)  $\kappa = \beta_{1}(\beta_{2}+3)^{2} [4(2\beta_{2}-3\beta_{1}-6)(4\beta_{2}-3\beta_{1})]^{-1}$ ,

where $\beta_{1}$ and $\beta_{2}$ are given by

(5.12)  $\beta_{1} = \mu_{3}^{2} \mu_{2}^{-3}$

and

(5.13)  $\beta_{2} = \mu_{4} \mu_{2}^{-2}$ .

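Depending upon the value of $\kappa$ , one of the Pearson type distributions
is assigned as the approximation to the conditional distribution of $\tau$ ,
given $\hat{\tau}$ . For different values of $\hat{\tau}$ , therefore, possibly different
types of Pearson distributions are assigned, and we have varieties
of different types of density functions for $\phi(\tau \mid \hat{\tau})$ . If, for
instance, $\kappa < 0$ , then the distribution assigned is the Beta
distribution, whose density function is given by the formula

(5.14)  $\phi(\tau \mid \hat{\tau}) = [B(p_{\hat{\tau}}, q_{\hat{\tau}})]^{-1} (\tau - a_{\hat{\tau}})^{p_{\hat{\tau}}-1} (b_{\hat{\tau}} - \tau)^{q_{\hat{\tau}}-1} (b_{\hat{\tau}} - a_{\hat{\tau}})^{-(p_{\hat{\tau}}+q_{\hat{\tau}}-1)}$ ,

in which the four parameters, $p_{\hat{\tau}}$ , $q_{\hat{\tau}}$ , $a_{\hat{\tau}}$ , and $b_{\hat{\tau}}$ , are estimated
from the four conditional moments, such that

(5.15)  $p_{\hat{\tau}}, q_{\hat{\tau}} = (r/2) [1 \pm (r+2) \{\beta_{1} [\beta_{1}(r+2)^{2} + 16(r+1)]^{-1}\}^{1/2}]$ ,

(5.16)  $b_{\hat{\tau}} - a_{\hat{\tau}} = \mu_{2}^{1/2} [\beta_{1}(r+2)^{2} + 16(r+1)]^{1/2} / 2$ ,

(5.17)  $a_{\hat{\tau}} = \mu'_{1} - p_{\hat{\tau}} (b_{\hat{\tau}} - a_{\hat{\tau}}) / r$ ,

and

(5.18)  $b_{\hat{\tau}} = \mu'_{1} + q_{\hat{\tau}} (b_{\hat{\tau}} - a_{\hat{\tau}}) / r$ ,

where r is defined as

(5.19)  $r = 6(\beta_{2} - \beta_{1} - 1)(6 + 3\beta_{1} - 2\beta_{2})^{-1}$ .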
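
The chain (5.15) through (5.19) can be followed numerically with the sketch below, which takes the four conditional moments at one value of $\hat{\tau}$ and returns the four Beta parameters. One point is an assumption rather than something stated in the text: the choice of sign in (5.15) is resolved by giving the smaller exponent to the lower end when the third moment is positive. The function name is illustrative.

```python
import math

def pearson_beta_parameters(mu1, mu2, mu3, mu4):
    """Beta parameters (p, q, a, b) from the mean and the 2nd-4th central moments."""
    beta1 = mu3**2 / mu2**3                                    # (5.12)
    beta2 = mu4 / mu2**2                                       # (5.13)
    r = 6 * (beta2 - beta1 - 1) / (6 + 3 * beta1 - 2 * beta2)  # (5.19)
    disc = beta1 * (r + 2)**2 + 16 * (r + 1)
    half = (r + 2) * math.sqrt(beta1 / disc)
    small, large = (r / 2) * (1 - half), (r / 2) * (1 + half)  # (5.15)
    # Assumption: positive skew means the longer tail is on the upper side,
    # so the smaller exponent p belongs to the lower end point a.
    p, q = (small, large) if mu3 >= 0 else (large, small)
    width = math.sqrt(mu2) * math.sqrt(disc) / 2               # (5.16)
    a = mu1 - p * width / r                                    # (5.17)
    b = mu1 + q * width / r                                    # (5.18)
    return p, q, a, b

# Check against a known case: Beta(2, 5) on [0, 1].
mu2 = 10 / 392
mu3 = math.sqrt(288 / 810) * mu2**1.5   # beta1 = 288/810 for Beta(2, 5)
mu4 = 2.88 * mu2**2                     # beta2 = 2.88 for Beta(2, 5)
p, q, a, b = pearson_beta_parameters(2 / 7, mu2, mu3, mu4)
```

Feeding in the exact moments of a Beta(2, 5) variable returns that distribution, which is a quick way to confirm that r equals p + q in these formulas.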

If $\kappa = 0$ , which results from $\beta_{1} = 0$ and $\beta_{2} < 3$ , the distribution
is a special case of the Beta distribution in which the density function
is symmetric, and the two parameters, $p_{\hat{\tau}}$ and $q_{\hat{\tau}}$ , are equal, such that


(5.20)  $p_{\hat{\tau}} = q_{\hat{\tau}} = r/2$ .
If $\kappa = 0$ , which results from $\beta_{1} = 0$ and $\beta_{2} = 3$ , then the
normal distribution is assigned, whose density function is given
by (5.10). If $\kappa > 1$ , then the distribution is of Pearson's Type
VI, and, if $0 < \kappa < 1$ , then the distribution is of Pearson's
Type IV, and so forth.
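
The assignment rule just described can be written as a small classifier. The sketch covers only the cases named in the text. Two details are assumptions of the sketch: $\kappa$ is set to 0 whenever $\beta_{1}$ vanishes (formula (5.11) is 0/0 at the normal point), and a small tolerance is used for the equality tests. Anything else is reported as not covered.

```python
def pearson_type(mu2, mu3, mu4, tol=1e-9):
    """Return (kappa, label) for the Pearson type suggested by the central moments."""
    beta1 = mu3**2 / mu2**3
    beta2 = mu4 / mu2**2
    if beta1 < tol:
        # Symmetric case; kappa taken as 0 by convention (assumption of this sketch).
        if abs(beta2 - 3) < tol:
            return 0.0, "normal"
        if beta2 < 3:
            return 0.0, "symmetric Beta"
        return 0.0, "not covered in the text"
    denom = 4 * (2 * beta2 - 3 * beta1 - 6) * (4 * beta2 - 3 * beta1)
    if denom == 0:
        return float("inf"), "not covered in the text"
    kappa = beta1 * (beta2 + 3)**2 / denom       # (5.11)
    if kappa < 0:
        return kappa, "Beta"
    if 0 < kappa < 1:
        return kappa, "Pearson Type IV"
    if kappa > 1:
        return kappa, "Pearson Type VI"
    return kappa, "not covered in the text"

normal_case = pearson_type(1.0, 0.0, 3.0)
uniform_case = pearson_type(1 / 12, 0.0, 1 / 80)
type4_case = pearson_type(1.0, 0.5**0.5, 5.0)
type6_case = pearson_type(1.0, 1.0, 4.6)
```

In practice the moments would come from (5.6) through (5.9) at each $\hat{\tau}$ , so the label can change from one examinee to the next, as the text notes.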

The advantage of the Pearson-System Method over the other two methods
is that it makes full use of the four estimated conditional moments
of $\tau$ , given $\hat{\tau}$ , without restricting the conditional distributions
to a single type. It has its disadvantages, however: in some
cases the estimation of the higher conditional moments is fairly
inaccurate for some range of $\hat{\tau}$ , and the estimation of the
parameters of some Pearson type distributions is difficult.

In the Two-Parameter Beta Method, the Beta distribution is
adopted for the conditional distribution of $\tau$ , given $\hat{\tau}$ , whose
density function is given by (5.14). Two parameters, $a_{\hat{\tau}}$ and $b_{\hat{\tau}}$ ,
are preassigned for each $\hat{\tau}$ by some appropriate method, and the
other two parameters, $p_{\hat{\tau}}$ and $q_{\hat{\tau}}$ , are estimated by

(5.21)  $p_{\hat{\tau}} = M_{1}^{2}(1-M_{1})M_{2}^{-1} - M_{1}$

and

(5.22)  $q_{\hat{\tau}} = M_{1}(1-M_{1})^{2}M_{2}^{-1} - (1-M_{1})$ ,

where

(5.23)  $M_{1} = (\mu'_{1} - a_{\hat{\tau}})(b_{\hat{\tau}} - a_{\hat{\tau}})^{-1}$

and

(5.24)  $M_{2} = \mu_{2}(b_{\hat{\tau}} - a_{\hat{\tau}})^{-2}$ .

This method has an advantage over the Normal Approach Method in the
sense that, unlike the normal density function, the Beta density
function provides us with varieties of different curves depending
upon the values of the parameters. Its disadvantage is, however,
that we have the additional work of finding an appropriate finite
interval, $[a_{\hat{\tau}}, b_{\hat{\tau}}]$ .
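
Formulas (5.21) through (5.24) amount to rescaling the first two conditional moments to the unit interval and solving for the Beta exponents. A minimal sketch follows. The function name is illustrative, and the interval end points are simply passed in, since the text leaves their choice open.

```python
def two_parameter_beta(mu1, mu2, a, b):
    """Beta exponents (p, q) from the conditional mean mu1, the conditional
    variance mu2, and a preassigned interval [a, b]."""
    m1 = (mu1 - a) / (b - a)                  # (5.23)
    m2 = mu2 / (b - a)**2                     # (5.24)
    p = m1**2 * (1 - m1) / m2 - m1            # (5.21)
    q = m1 * (1 - m1)**2 / m2 - (1 - m1)      # (5.22)
    return p, q

# A Beta(2, 5) variable on [0, 1] has mean 2/7 and variance 10/392.
p_unit, q_unit = two_parameter_beta(2 / 7, 10 / 392, 0.0, 1.0)
# The same shape stretched onto the hypothetical interval [-1, 3].
p_wide, q_wide = two_parameter_beta(-1 + 4 * 2 / 7, 16 * 10 / 392, -1.0, 3.0)
```

Both calls should recover the exponents 2 and 5, which shows that the estimates depend on the interval only through the rescaled moments.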

VI  Histogram Ratio and Curve Fitting Approaches

The two approaches discussed here, as well as the Conditional
P.D.F. Approach, make full use of the approximated density function,
$g(\hat{\tau})$ , which is obtained on the entire set of N $\hat{\tau}$'s . The
conditional moments of $\tau$ , given $\hat{\tau}$ , are obtained by (5.6)
through (5.9), using this approximated density function for $g(\hat{\tau})$ .

We calibrate a certain number of $\tau$ for each of the N $\hat{\tau}$'s,
through the Monte Carlo method, in accordance with the approximated
conditional density function of $\tau$ , given $\hat{\tau}$ . This approximated
density function, $\phi(\tau \mid \hat{\tau})$ , can be a normal density function, a Beta
density function, or one of the Pearson System density functions,
depending upon which of the three methods, i.e., Normal Approach
Method, Two-Parameter Beta Method and Pearson-System Method, we
choose. Let $\tilde{\tau}$ denote these calibrated $\tau$'s , and $\nu$ be the
number of $\tilde{\tau}$'s calibrated for each examinee i . Thus we
obtain $(\nu \times N)$ $\tilde{\tau}$'s in total. We classify these $\tilde{\tau}$'s into $(m_{h}+1)$
item score groups, where h is a new test item whose operating
characteristics are to be estimated, depending upon the item score
$x_{h}$ $(= 0, 1, \ldots, m_{h})$ the specific examinee obtained for item h . Then
each $\tilde{\tau}$ is transformed to $\tilde{\theta}$ , through

(6.1)  $\theta = \tau^{-1}[\tau(\theta)]$ .

When $\tau(\cdot)$ is given by the polynomial (3.5), for example, this
process can easily be performed by the Newton-Raphson method.
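
The calibration and back-transformation steps above can be sketched as follows for the Normal Approach Method. The polynomial coefficients, the conditional mean and standard deviation, and the value of $\nu$ are all hypothetical placeholders. The actual transformation (3.5) and its coefficients are given elsewhere in the report and are not reproduced here.

```python
import numpy as np
from numpy.polynomial import Polynomial

def invert_transformation(tau_poly, tau_value, start=0.0, tol=1e-10, max_iter=50):
    """Solve tau_poly(theta) = tau_value for theta by Newton-Raphson."""
    slope = tau_poly.deriv()
    theta = start
    for _ in range(max_iter):
        step = (tau_poly(theta) - tau_value) / slope(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical strictly increasing transformation tau(theta) = theta + 0.05 theta^3.
tau_poly = Polynomial([0.0, 1.0, 0.0, 0.05])

# Monte Carlo calibration for one examinee under the Normal Approach Method:
# nu draws from a normal density with a hypothetical conditional mean and s.d.
rng = np.random.default_rng(0)
nu = 5
calibrated_tau = rng.normal(loc=0.3, scale=0.25, size=nu)
calibrated_theta = np.array(
    [invert_transformation(tau_poly, t) for t in calibrated_tau]
)
```

Newton-Raphson is well behaved here because the example polynomial is strictly increasing. For a transformation that is not monotone over the whole real line, the starting value would need more care.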
In the Histogram Ratio Approach, these $(\nu \times N)$ $\tilde{\theta}$'s are
categorized into intervals of small, equal widths. The ratio of
the frequency of $\tilde{\theta}$'s, which belong to examinees whose item score for
item h is $x_{h}$ , to the total frequency, in each subinterval of $\theta$ ,
provides us with the estimated operating characteristic, $\hat{P}_{x_{h}}(\theta)$ .
Let $H_{x_{h}}(\tilde{\theta} \in s)$ denote the frequency of $\tilde{\theta}$'s , which belong to the
item score group $x_{h}$ , for the subinterval s , whose midpoint is
$\theta_{s}$ . Then we can write

(6.2)  $\hat{P}_{x_{h}}(\theta_{s}) = H_{x_{h}}(\tilde{\theta} \in s) \left[ \sum_{j=0}^{m_{h}} H_{j}(\tilde{\theta} \in s) \right]^{-1}$ ,  $x_{h} = 0, 1, \ldots, m_{h}$ .

In order to obtain a smooth curve for this estimated operating
characteristic, it is advisable to use a fairly large number for $\nu$ ,
and a small width for the subinterval s of $\theta$ .
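
Formula (6.2) is a ratio of histogram counts, which the following sketch computes for all item score groups at once. The function name and the toy data are illustrative. Subintervals that contain no calibrated value are reported as NaN, a choice made for this sketch only.

```python
import numpy as np

def histogram_ratio(theta_tilde, item_scores, n_categories, lower, upper, width):
    """Estimated operating characteristics per (6.2).

    Returns the subinterval midpoints and an array whose row x holds the
    ratio for item score x in each subinterval."""
    n_bins = int(round((upper - lower) / width))
    edges = lower + width * np.arange(n_bins + 1)
    counts = np.array(
        [np.histogram(theta_tilde[item_scores == x], bins=edges)[0]
         for x in range(n_categories)],
        dtype=float,
    )
    total = counts.sum(axis=0)
    ratio = np.full_like(counts, np.nan)
    np.divide(counts, total, out=ratio, where=total > 0)
    midpoints = (edges[:-1] + edges[1:]) / 2
    return midpoints, ratio

# Toy data: five calibrated values with item scores 0 or 1.
theta_tilde = np.array([0.1, 0.2, 0.3, 0.6, 0.7])
item_scores = np.array([0, 0, 1, 1, 1])
midpoints, ratio = histogram_ratio(theta_tilde, item_scores, 2, 0.0, 1.0, 0.5)
```

Each calibrated value carries the item score of the examinee it was generated for, so `item_scores` has one entry per calibrated value rather than one per examinee.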

In the Curve Fitting Approach, a polynomial of a certain degree
is fitted by the method of moments to the subset of $\tilde{\theta}$'s for each
item score group $x_{h}$ . Then the ratio of the resultant polynomial
to the sum of $(m_{h}+1)$ such polynomials is taken, and this ratio
provides us with the estimated operating characteristic of the
item response $x_{h}$ . Let $q_{x_{h}}(\theta)$ be such a polynomial for the item
score group $x_{h}$ . We obtain for the estimated operating characteristic,
$\hat{P}_{x_{h}}(\theta)$ ,

(6.3)  $\hat{P}_{x_{h}}(\theta) = q_{x_{h}}(\theta) \left[ \sum_{j=0}^{m_{h}} q_{j}(\theta) \right]^{-1}$ ,  $x_{h} = 0, 1, \ldots, m_{h}$ .
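
A sketch of (6.3) is given below. It rests on an assumption that the text does not spell out: each group polynomial is fitted to the raw power sums of its subset, so that it integrates to the group frequency rather than to 1. Without some such scaling the ratio would ignore the group sizes. The function names, the common fitting interval, and the simulated data are all illustrative.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_frequency_polynomial(values, degree, lower, upper):
    """Ascending coefficients of the polynomial on [lower, upper] whose moments
    of order 0..degree equal the raw power sums of `values`.
    Assumption: frequency scaling (the polynomial integrates to len(values))."""
    orders = np.arange(degree + 1)
    power_sums = np.array([np.sum(values ** j) for j in orders])
    n = np.arange(2 * degree + 1)
    integrals = (upper ** (n + 1) - lower ** (n + 1)) / (n + 1)
    hankel = integrals[np.add.outer(orders, orders)]
    return np.linalg.solve(hankel, power_sums)

def curve_fitting_oc(theta_tilde, item_scores, n_categories, degree, theta_grid):
    """Ratio (6.3) evaluated on theta_grid; row x belongs to item score x."""
    lower, upper = theta_tilde.min(), theta_tilde.max()
    curves = np.array([
        P.polyval(theta_grid, fit_frequency_polynomial(
            theta_tilde[item_scores == x], degree, lower, upper))
        for x in range(n_categories)
    ])
    return curves / curves.sum(axis=0)

# Simulated calibrated values for three item score groups (hypothetical data).
rng = np.random.default_rng(0)
theta_tilde = np.concatenate([rng.normal(c, 0.8, 300) for c in (-1.0, 0.0, 1.0)])
item_scores = np.repeat([0, 1, 2], 300)
theta_grid = np.linspace(-1.5, 1.5, 7)
oc = curve_fitting_oc(theta_tilde, item_scores, 3, 3, theta_grid)
```

Because a fitted polynomial can turn negative near the ends of the interval, the ratio is most trustworthy where the calibrated values are dense.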
VII  Conditional P.D.F. Approach

In this approach, we specify the exact function of the
approximated conditional density, $\phi(\tau \mid \hat{\tau})$ , using the parameters
estimated from the approximated density function $g(\hat{\tau})$ (cf. Chapter 5).
Again, this approximation to the conditional density function, $\phi(\tau \mid \hat{\tau})$ ,
can be a normal density function, a Beta density function, or one of
the Pearson System density functions, depending upon which one of the
Normal Approach Method, the Two-Parameter Beta Method, and the
Pearson-System Method we choose.

In the Simple Sum Procedure, these specified, approximated
conditional density functions are categorized into the $(m_{h}+1)$ item
score groups for a new item h , whose operating characteristics are
to be estimated, depending upon the item score $x_{h}$ $(= 0, 1, 2, \ldots, m_{h})$
that each examinee has obtained. By virtue of (2.10), the
transformation of $\tau$ to $\theta$ is made through (6.1), and the estimated
operating characteristic, $\hat{P}_{x_{h}}(\theta)$ , is given by

(7.1)  $\hat{P}_{x_{h}}(\theta) = \sum_{i \in x_{h}} \phi(\tau \mid \hat{\tau}_{i}) \left[ \sum_{i=1}^{N} \phi(\tau \mid \hat{\tau}_{i}) \right]^{-1}$ ,  $x_{h} = 0, 1, \ldots, m_{h}$ ,

where i denotes an individual examinee and $\hat{\tau}_{i}$ is the maximum
likelihood estimate of $\tau$ for the individual i .

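In the Weighted Sum Procedure, the estimated operating
characteristic is given by

(7.2)  $\hat{P}_{x_{h}}(\theta) = \sum_{i \in x_{h}} w(\hat{\tau}_{i}) \phi(\tau \mid \hat{\tau}_{i}) \left[ \sum_{i=1}^{N} w(\hat{\tau}_{i}) \phi(\tau \mid \hat{\tau}_{i}) \right]^{-1}$ ,  $x_{h} = 0, 1, \ldots, m_{h}$ ,

where $w(\hat{\tau}_{i})$ is an appropriate weight assigned to the maximum
likelihood estimate $\hat{\tau}_{i}$ for the individual examinee i . The Simple Sum
Procedure can be considered, therefore, as a special case of the
Weighted Sum Procedure, in which $w(\hat{\tau}_{i}) = 1$ for all the individual
examinees. Another example of such a weight, $w(\hat{\tau}_{i})$ , is the area under
the approximated density function, $g(\hat{\tau})$ , for the interval of $\hat{\tau}$
which starts from the midpoint between $\hat{\tau}_{i}$ and the lower adjacent
$\hat{\tau}$ and ends with the midpoint between $\hat{\tau}_{i}$ and the upper adjacent
$\hat{\tau}$ . The transformation of $\tau$ to $\theta$ in (7.2) can be made through
(6.1), as in the Simple Sum Procedure.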
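
The two procedures can be sketched together, since the text describes the Simple Sum Procedure as the case of unit weights. The sketch assumes the Normal Approach Method, so that each examinee contributes a normal density with its own conditional mean and variance, and it assumes the weighted ratio has the same form as (7.1) with each density multiplied by its weight. Names and data are illustrative. The curves are computed on a grid of $\tau$ and would still have to be carried over to $\theta$ through (6.1).

```python
import numpy as np

def weighted_sum_oc(tau_grid, cond_mean, cond_var, item_scores,
                    n_categories, weights=None):
    """Weighted ratio of conditional densities; weights=None gives the
    Simple Sum Procedure (all weights equal to 1)."""
    if weights is None:
        weights = np.ones_like(cond_mean)
    # dens[i, g] = normal density of tau_grid[g] for examinee i
    diff = tau_grid[None, :] - cond_mean[:, None]
    var = cond_var[:, None]
    dens = np.exp(-diff**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    weighted = weights[:, None] * dens
    total = weighted.sum(axis=0)
    return np.array([weighted[item_scores == x].sum(axis=0) / total
                     for x in range(n_categories)])

# Three hypothetical examinees with item scores 0, 1, 1.
cond_mean = np.array([-1.0, 0.0, 1.0])
cond_var = np.array([0.25, 0.25, 0.25])
item_scores = np.array([0, 1, 1])
tau_grid = np.array([-1.0, 0.0, 1.0])
simple = weighted_sum_oc(tau_grid, cond_mean, cond_var, item_scores, 2)
weighted = weighted_sum_oc(tau_grid, cond_mean, cond_var, item_scores, 2,
                           weights=np.array([2.0, 1.0, 1.0]))
```

Raising the weight of the first examinee raises the estimated curve of that examinee's item score group wherever its density is not negligible.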

We have a somewhat different rationale behind the Proportioned
Sum Procedure. Let $p(i \in x_{h})$ be the probability with which examinee
i belongs to the item score group $x_{h}$ . We can write for the
estimated operating characteristic, $\hat{P}_{x_{h}}(\theta)$ , of the item response
$x_{h}$ to a new item h

(7.3)  $\hat{P}_{x_{h}}(\theta) = \sum_{i=1}^{N} \hat{p}(i \in x_{h}) \phi(\tau \mid \hat{\tau}_{i}) \left[ \sum_{i=1}^{N} \phi(\tau \mid \hat{\tau}_{i}) \right]^{-1}$ ,  $x_{h} = 0, 1, \ldots, m_{h}$ ,

where $\hat{p}(i \in x_{h})$ is the estimate of the probability $p(i \in x_{h})$ , which
satisfies

(7.4)  $\sum_{x_{h}=0}^{m_{h}} \hat{p}(i \in x_{h}) = \sum_{x_{h}=0}^{m_{h}} p(i \in x_{h})$ .

One example of this proportional weight, $\hat{p}(i \in x_{h})$ , is the proportion
of examinees who belong to the item score group $x_{h}$ within a specified
interval of $\hat{\tau}$ for which $\hat{\tau}_{i}$ is the midpoint. The transformation of
$\tau$ to $\theta$ in (7.3) is, again, made through (6.1).
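
A sketch of (7.3) with the example weight mentioned above follows. For each examinee, the proportion of each item score group is taken among the examinees whose $\hat{\tau}$ lies within a window centred at that examinee's own $\hat{\tau}_{i}$ . The half-width of the window, the normal form of the conditional density, and all names are assumptions of the sketch.

```python
import numpy as np

def proportioned_sum_oc(tau_grid, tau_hat, cond_mean, cond_var,
                        item_scores, n_categories, half_width):
    """Return (p_hat, oc): p_hat[i, x] estimates p(i in x), and oc[x, g] is
    the ratio (7.3) evaluated at tau_grid[g]."""
    # near[i, k] is True when examinee k falls in the window centred at i
    # (examinee i always counts itself, so no row is empty).
    near = np.abs(tau_hat[:, None] - tau_hat[None, :]) <= half_width
    onehot = np.eye(n_categories)[item_scores]
    p_hat = (near @ onehot) / near.sum(axis=1, keepdims=True)
    diff = tau_grid[None, :] - cond_mean[:, None]
    var = cond_var[:, None]
    dens = np.exp(-diff**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    oc = (p_hat.T @ dens) / dens.sum(axis=0)
    return p_hat, oc

# Hypothetical data: six examinees, item scores 0 to 2.
tau_hat = np.array([-1.2, -0.8, -0.1, 0.2, 0.9, 1.3])
item_scores = np.array([0, 0, 1, 1, 2, 2])
cond_mean = 0.9 * tau_hat
cond_var = np.full(6, 0.2)
tau_grid = np.linspace(-1.0, 1.0, 5)
p_hat, oc = proportioned_sum_oc(tau_grid, tau_hat, cond_mean, cond_var,
                                item_scores, 3, half_width=0.5)
```

With this weight the estimated proportions for each examinee add up to one over the item score groups, so the estimated operating characteristics also add up to one at every point of the grid.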
VIII  Bivariate P.D.F. Approach

In contrast to the other three approaches, the Bivariate P.D.F.
Approach makes use of the estimated bivariate density function, rather
than the estimated conditional density function, $\phi(\tau \mid \hat{\tau})$ . Let
$\xi(\tau, \hat{\tau})$ denote the bivariate density function of $\tau$ and $\hat{\tau}$ . We
can write

(8.1)  $\xi(\tau, \hat{\tau}) = \phi(\tau \mid \hat{\tau}) \, g(\hat{\tau})$ .

We classify the set of N $\hat{\tau}_{i}$'s into $(m_{h}+1)$ item score
categories, depending upon the item score $x_{h}$ $(= 0, 1, \ldots, m_{h})$ the
examinee i obtained for a new test item h , for which the operating
characteristics are to be estimated.

The method of moments is applied for each of these $(m_{h}+1)$
subsets of $\hat{\tau}$ , and the density function, $g_{x_{h}}(\hat{\tau})$ , is estimated
for each subgroup. The conditional moments of $\tau$ , given $\hat{\tau}$ ,
are also obtained for separate subgroups, using the formulas (5.6)
through (5.9). Based on these estimated conditional moments, the
parameters of a specific density function, which is adopted for $\phi(\tau \mid \hat{\tau})$ ,
are obtained for each subgroup $x_{h}$ . The choice of $\phi(\tau \mid \hat{\tau})$ depends
upon which of the three methods, i.e., Normal Approach Method,
Two-Parameter Beta Method and Pearson-System Method, is taken. The
bivariate density function of $\tau$ and $\hat{\tau}$ is obtained from (8.1)
for each of the $(m_{h}+1)$ subgroups. Let $\xi_{x_{h}}(\tau, \hat{\tau})$ denote the
estimated bivariate density function of $\tau$ and $\hat{\tau}$ for the subgroup
$x_{h}$ .

The transformation of $\tau$ to $\theta$ in (8.2) is again made through (6.1).

There is a somewhat different procedure which also belongs to
the Bivariate P.D.F. Approach (Samejima, 1977c), which is called the
Normal Approximation Method. In this method, the estimation of the
density function, $g(\hat{\tau})$ , is not necessary. We approximate $\xi_{x_{h}}(\tau, \hat{\tau})$ ,
the bivariate density function of $\tau$ and $\hat{\tau}$ for each item score group
$x_{h}$ , by a bivariate normal density function (e.g., Anderson, 1958),
whose parameters are estimated from our observations. The regression
of $\tau$ on $\hat{\tau}$ is estimated by the least squares method, which provides
us with

(8.3)  $E(\tau \mid \hat{\tau}) = [1 - C^{-2}\{\mathrm{Var.}(\hat{\tau})\}^{-1}] \hat{\tau} + C^{-2} [\mathrm{Var.}(\hat{\tau})]^{-1} E(\hat{\tau})$ ,

where $E(\hat{\tau})$ and $\mathrm{Var.}(\hat{\tau})$ denote the expectation and the variance of
$\hat{\tau}$ for the subgroup $x_{h}$ . The conditional variance of $\tau$ , given
$\hat{\tau}$ , is obtained by

(8.4)  $\mathrm{Var.}(\tau \mid \hat{\tau}) = C^{-2} [1 - C^{-2}\{\mathrm{Var.}(\hat{\tau})\}^{-1}]$ .
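
Formulas (8.3) and (8.4) depend on the subgroup only through the mean and variance of its $\hat{\tau}$'s, as the short sketch below makes explicit. The function name and the numbers are illustrative.

```python
def normal_approximation_regression(mean_tau_hat, var_tau_hat, C):
    """Slope and intercept of E(tau | tau_hat) per (8.3), and the conditional
    variance per (8.4), for one item score subgroup."""
    error_share = (1.0 / C**2) / var_tau_hat   # C^-2 {Var(tau_hat)}^-1
    slope = 1.0 - error_share
    intercept = error_share * mean_tau_hat
    cond_var = (1.0 / C**2) * slope
    return slope, intercept, cond_var

# Hypothetical subgroup: mean 0.5 and variance 1.0 of tau_hat, with C = 2.
slope, intercept, cond_var = normal_approximation_regression(0.5, 1.0, 2.0)
```

The slope shrinks each $\hat{\tau}$ toward the subgroup mean by the share of the observed variance that is attributable to estimation error, which here equals $C^{-2}$ divided by the variance of $\hat{\tau}$ .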

The estimated operating characteristic, $\hat{P}_{x_{h}}(\theta)$ , can be obtained
either through the Monte Carlo calibration of $\tilde{\tau}$ and the procedure
similar  to  the  Histogram  Ratio  Approach  or  the  Curve  Fitting  Approach, 
or  by  the  ratio  of  the  integral  of  the  bivariate  density  function  for 
IX  Discussion and Conclusions

The rationale behind the methods and approaches for estimating
the operating characteristics of the graded item responses when the
test information function of the Old Test is not constant, and the
outline of their procedures, have been presented. It has been shown that the
generalization of our old methods and approaches to the above situation
is relatively simple and straightforward, at least in theory. Since
the elimination of the restriction of the constant amount of test
information will greatly widen the applicability of the methods and
approaches, especially in the paper-and-pencil situation, this
generalization may contribute a great deal to research in psychometrics
and applied psychological measurement.

We need carefully designed simulation studies, however, before
using these methods and approaches for empirical data, in order to observe
how these procedures work. It is anticipated that, for the range of
$\theta$ where the test information function, $I(\theta)$ , of the Old Test assumes
low values, the estimation of the operating characteristics is less
accurate, compared with the one which is based upon the Old Test
having a constant amount of test information. It may be especially
so for both lower and higher extreme values of $\theta$ when the test
information function is bell-shaped, as it is for Subtest 1, which
was introduced in earlier chapters. Comparison of the results using
different types of test information functions, such as those of Subtests 1
and 2 in the present paper, will be meaningful.

